
Execution fixes for markapi. #39

Open
gitnnolabs wants to merge 6 commits into markupapp from fix-markupapp

@gitnnolabs

What does this PR do?

This PR adjusts the installation and execution of markapi.

Where should the review start?

I suggest the following steps for the review:

  1. Check the files changed in this PR
  2. Clone the project from git
  3. Switch to the fix-markupapp branch
  4. Follow the steps described at: https://github.com/scieloorg/markapi/wiki/Guia-de-Instala%C3%A7%C3%A3o-do-MarkAPI
  5. Load one of the documents included in the new fixtures folder as a test.
  6. For some documents, errors show up in the Celery worker log; this bug is reported in the issue: Erro na marcação de arquivos DOCx #38 (error when marking up DOCX files)

How can this be tested manually?

To test the markup of a document manually, follow the steps at: https://github.com/scieloorg/markapi/wiki/Converter-DOCX-para-XML. Use the document e14790.docx in fixtures, which runs without errors.

Any context you want to provide?

I removed the duplicate containers and made the llama model mandatory for markup.

Screenshots

Here are some screens of marked-up documents:

[Six screenshots taken on 2025-11-30 between 10:25 and 10:37]

What are the relevant tickets?

There is no ticket for this activity.

References

N/A

Copilot AI left a comment

Pull request overview

This PR addresses installation and execution issues for markapi by consolidating llama dependencies into base requirements and making the llama model mandatory for document markup operations. The changes remove duplicate Docker configurations and add test fixtures for validation.

Key changes:

  • Moved llama-related dependencies from optional to required base dependencies
  • Changed LLAMA_ENABLED default to True, making llama model mandatory
  • Renamed database field from status to estatus in Reference model
  • Removed duplicate Docker configurations (llama.local.yml and Dockerfile.llama)
  • Added test fixture DOCX files for document processing validation

Reviewed changes

Copilot reviewed 7 out of 10 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| requirements/base.txt | Consolidated llama dependencies (huggingface-hub, llama-cpp-python) into base requirements |
| requirements/extra-llama.txt | Removed file as dependencies moved to base |
| config/settings/base.py | Changed LLAMA_ENABLED default from False to True |
| reference/models.py | Renamed field from status to estatus |
| reference/wagtail_hooks.py | Updated field reference to match rename |
| llama.local.yml | Removed duplicate Docker Compose configuration |
| compose/local/django/Dockerfile.llama | Removed specialized Dockerfile |
| fixtures/e740.docx | Added test fixture (binary file) |
| fixtures/e14790.docx | Added test fixture (binary file) |

  new_reference = Reference.objects.create(
      mixed_citation=linea,
-     status=1, # Estatus predeterminado
+     estatus=1, # Estatus predeterminado

Copilot AI Dec 8, 2025

This field rename from status=1 to estatus=1 is a breaking change that requires a database migration. Ensure that a migration file has been created and tested to handle this field rename, otherwise existing database records will be inaccessible.

Copilot uses AI. Check for mistakes.

  # LLAMA
- LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=False)
+ LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=True)

Copilot AI Dec 8, 2025

Changing LLAMA_ENABLED default from False to True makes the llama model mandatory for all deployments. This is a significant behavioral change that:

  1. Requires the llama model file to be present in all environments
  2. May impact existing deployments that don't have llama configured
  3. Should be clearly documented in deployment/upgrade instructions

Consider whether this should be an opt-in change rather than changing the default, or ensure proper deployment documentation is updated.

Suggested change
- LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=True)
+ LLAMA_ENABLED = env.bool("LLAMA_ENABLED", default=False)
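
If the default stays True, one way to address the deployment concern is to fail fast at startup when the flag is on but no model file is present. The following is a minimal sketch using only the standard library as a stand-in for the project's `env.bool`; the helper names `read_bool_env` and `check_llama_config`, and the idea of a configurable model path, are assumptions made for illustration and are not taken from the markapi code.

```python
import os
from pathlib import Path


def read_bool_env(name: str, default: bool = False) -> bool:
    """Parse a boolean flag from the environment, falling back to `default`."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def check_llama_config(enabled: bool, model_path: str) -> None:
    """Raise early if llama is enabled but the model file is missing.

    `model_path` is a hypothetical setting; adapt it to wherever the
    deployment actually stores the model.
    """
    if enabled and not Path(model_path).is_file():
        raise RuntimeError(
            f"LLAMA_ENABLED is true but no model file was found at {model_path!r}"
        )
```

Called once during settings or app initialization, a check like this turns a missing model into an explicit startup error instead of a failure inside a Celery task.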

  mixed_citation = models.TextField(_("Mixed Citation"), null=False, blank=True)

- status = models.IntegerField(
+ estatus = models.IntegerField(

Copilot AI Dec 8, 2025

The field name change from status to estatus appears inconsistent with the model name ReferenceStatus and the translation _("Reference status"). This creates confusion as the field name suggests Spanish while the related choices class and translation key use English. Consider either:

  1. Keeping the field name as status to match the choices class name
  2. Or changing both to use consistent Spanish naming (e.g., ReferenceEstatus choices class)

Suggested change
- estatus = models.IntegerField(
+ status = models.IntegerField(

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 10 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

reference/models.py:29

  • Renaming Reference.status to Reference.estatus requires a schema migration (ideally a RenameField migration to preserve data). Right now existing migrations and code still reference status (e.g., reference/migrations/0001_initial.py, reference/api/v1/views.py, reference/data_utils.py), which will break at runtime after migrating. Add/update migrations and update all remaining status references to estatus (or keep the field name as status if the rename is not intended).
class Reference(CommonControlField, ClusterableModel):
    mixed_citation = models.TextField(_("Mixed Citation"), null=False, blank=True)

    estatus = models.IntegerField(
        _("Reference status"),
        choices=ReferenceStatus.choices,
        blank=True,
        default=ReferenceStatus.NO_REFERENCE
    )
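
To illustrate why a rename migration preserves data, here is a minimal sketch using the standard-library `sqlite3` module. In Django the equivalent operation would be `migrations.RenameField(model_name="reference", old_name="status", new_name="estatus")`. The table name `reference_reference` and the sample row are assumptions made for the demonstration, not taken from the project's database.

```python
import sqlite3

# In-memory database standing in for the real one (hypothetical table layout).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reference_reference ("
    "id INTEGER PRIMARY KEY, mixed_citation TEXT, status INTEGER)"
)
conn.execute(
    "INSERT INTO reference_reference (mixed_citation, status) VALUES (?, ?)",
    ("Doe J. 2020.", 1),
)

# Rename the column in place; requires SQLite 3.25 or newer.
conn.execute("ALTER TABLE reference_reference RENAME COLUMN status TO estatus")

# Existing rows remain readable under the new column name.
rows = conn.execute(
    "SELECT mixed_citation, estatus FROM reference_reference"
).fetchall()
columns = [info[1] for info in conn.execute("PRAGMA table_info(reference_reference)")]
```

Without such a rename step, the model would query a column named `estatus` while the table created by the initial migration only has `status`.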

Comment on lines +55 to +61
# huggingface-hub
# ------------------------------------------------------------------------------
huggingface_hub==0.26.1 # https://pypi.org/project/huggingface-hub/

# Python Bindings for llama.cpp
# ------------------------------------------------------------------------------
llama-cpp-python==0.3.14 # https://llama-cpp-python.readthedocs.io/en/latest/

Copilot AI Mar 24, 2026

huggingface_hub is the import module name; the canonical PyPI distribution name is huggingface-hub. pip normalizes project names (PEP 503 treats runs of hyphens, underscores, and dots as equivalent), so the install resolves with either spelling, but consider writing the requirement as huggingface-hub==0.26.1 so it matches the PyPI project name and the comment header above it.
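
For context, pip compares project names after PEP 503 normalization, under which runs of `-`, `_`, and `.` collapse to a single hyphen, so both spellings name the same distribution. A minimal sketch of that rule (the helper name `canonicalize_name` is chosen here for illustration):

```python
import re


def canonicalize_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Both spellings from the review map to the same canonical project name.
same_project = canonicalize_name("huggingface_hub") == canonicalize_name("huggingface-hub")
```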

Comment on lines 26 to 30
  new_reference = Reference.objects.create(
      mixed_citation=linea,
-     status=1, # Estatus predeterminado
+     estatus=1, # Estatus predeterminado
      creator=self.request.user, # Usuario asociado
  )

Copilot AI Mar 24, 2026

This creates references with a hard-coded estatus=1. Since ReferenceStatus exists and is used elsewhere, prefer using the enum value (e.g., ReferenceStatus.CREATING) to avoid magic numbers and keep behavior stable if the enum changes.
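
A minimal sketch of the enum-based alternative follows. It uses the standard-library `IntEnum` as a stand-in for the project's `ReferenceStatus` choices class; the numeric values and the helper `build_reference_kwargs` are hypothetical and only illustrate the idea.

```python
from enum import IntEnum


class ReferenceStatus(IntEnum):
    # Hypothetical values; the real choices class lives in reference/models.py.
    NO_REFERENCE = 0
    CREATING = 1


def build_reference_kwargs(line: str, user: object) -> dict:
    """Collect the keyword arguments for Reference.objects.create()."""
    return {
        "mixed_citation": line,
        # Named member instead of the literal 1.
        "estatus": ReferenceStatus.CREATING,
        "creator": user,
    }
```

If the numeric values are ever renumbered, call sites that use the named member keep their intended meaning, while a literal `1` would silently point at a different status.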

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 10 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

reference/models.py:29

  • The model field was renamed from status to estatus, but the existing migrations (e.g., reference/migrations/0001_initial.py) still create the status column, and other parts of the app still use the status attribute (e.g., reference/data_utils.py, reference/api/v1/views.py). This will break at runtime (AttributeError) and/or during the schema migration. Suggestion: keep the name status in the model, or apply a RenameField migration and update every access, serializer, and query to the new name consistently.
    estatus = models.IntegerField(
        _("Reference status"),
        choices=ReferenceStatus.choices,
        blank=True,
        default=ReferenceStatus.NO_REFERENCE
    )

  new_reference = Reference.objects.create(
      mixed_citation=linea,
-     status=1, # Estatus predeterminado
+     estatus=1, # Estatus predeterminado

Copilot AI Mar 24, 2026

This was changed to estatus=1, but the rest of the app still uses status (and the migrations create status). This will fail when creating the object via Wagtail. Suggestion: use status=ReferenceStatus.CREATING (or equivalent), or align the field rename with a migration plus a full update of the code.

Suggested change
- estatus=1, # Estatus predeterminado
+ status=1, # Estatus predeterminado


Labels

bug, enhancement

2 participants